Binning in Gaussian Kernel Regularization
Authors
Abstract
Gaussian kernel regularization is widely used in the machine learning literature and has proven successful in many empirical experiments. The periodic version of Gaussian kernel regularization has been shown to be minimax rate optimal for estimating functions in any finite-order Sobolev space. However, for a data set with n points, the computational complexity of the Gaussian kernel regularization method is of order O(n^3). In this paper we propose to use binning to reduce the computation of Gaussian kernel regularization in both regression and classification. For periodic Gaussian kernel regression, we show that the binned estimator achieves the same minimax rates as the unbinned estimator, while the computation is reduced to O(m^3), where m is the number of bins. To achieve the minimax rate in the k-th order Sobolev space, m needs to be of order O(n^{1/(2k+1)}), which makes the computation of the binned estimator of order O(n^{3/(2k+1)}): O(n) for k = 1, and even less for larger k. Our simulations show that the binned estimator (binning 120 data points into 20 bins in our simulation) provides almost the same accuracy with only 0.4% of the computation time. For classification, binning with L2-loss Gaussian kernel regularization and with Gaussian kernel Support Vector Machines is tested on a polar cloud detection problem.
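To make the binning idea concrete, the following is a minimal sketch, in Python with NumPy, of one plausible reading of the scheme described above: average the responses within m equal-width bins on [0, 1] and solve an ordinary (here non-periodic) Gaussian kernel ridge regression on the m bin centers, so that the cubic-cost linear solve is O(m^3) rather than O(n^3). The function name, parameter values, and the empty-bin fallback are illustrative assumptions, not the authors' implementation.

# Sketch of binning for Gaussian kernel regression (assumed reading of the
# abstract, not the authors' code): bin n points into m bins, average the
# responses per bin, then fit kernel ridge regression on the bin centers.
import numpy as np

def binned_gaussian_kernel_regression(x, y, m=20, bandwidth=0.1, lam=1e-3):
    """Fit Gaussian kernel ridge regression on m bin averages of (x, y).

    x, y : 1-D arrays of n inputs in [0, 1] and their responses.
    m    : number of bins; m << n shrinks the solve from O(n^3) to O(m^3).
    Returns a function that evaluates the fitted regression at new points.
    """
    # Bin the inputs into m equal-width bins and average y within each bin.
    edges = np.linspace(0.0, 1.0, m + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, m - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Empty-bin fallback to 0.0 is an arbitrary choice for this sketch.
    y_bar = np.array([y[idx == j].mean() if np.any(idx == j) else 0.0
                      for j in range(m)])

    # Gaussian kernel Gram matrix on the m bin centers (m x m, not n x n).
    d2 = (centers[:, None] - centers[None, :]) ** 2
    K = np.exp(-d2 / (2.0 * bandwidth ** 2))

    # Regularized solve (K + lam*I) alpha = y_bar: the O(m^3) step.
    alpha = np.linalg.solve(K + lam * np.eye(m), y_bar)

    def predict(x_new):
        d2_new = (np.asarray(x_new)[:, None] - centers[None, :]) ** 2
        return np.exp(-d2_new / (2.0 * bandwidth ** 2)) @ alpha

    return predict

# Example at the scale of the simulation in the abstract: 120 points, 20 bins.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 120)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(120)
f_hat = binned_gaussian_kernel_regression(x, y, m=20)
print(f_hat(np.array([0.25, 0.5, 0.75])))

With n = 120 and m = 20, as in the simulation above, the Gram matrix shrinks from 120 x 120 to 20 x 20, which is where the reported savings in computation time would come from.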
Similar resources
Rotation Invariant Angular Descriptor Via A Bandlimited Gaussian-like Kernel
We present a new smooth, Gaussian-like kernel that allows the kernel density estimate for an angular distribution to be exactly represented by a finite number of its Fourier series coefficients. Distributions of angular quantities, such as gradients, are a central part of several state-of-the-art image processing algorithms, but these distributions are usually described via histograms and there...
On Bochner's and Polya's Characterizations of Positive-Definite Kernels and the Respective Random Feature Maps
Positive-definite kernel functions are fundamental elements of kernel methods and Gaussian processes. A well-known construction of such functions comes from Bochner’s characterization, which connects a positive-definite function with a probability distribution. Another construction, which appears to have attracted less attention, is Polya’s criterion that characterizes a subset of these functio...
Fast Computation of Kernel Estimators
The computational complexity of evaluating the kernel density estimate (or its derivatives) at m evaluation points given n sample points scales quadratically as O(nm), making it prohibitively expensive for large data sets. While approximate methods like binning could speed up the computation, they lack precise control over the accuracy of the approximation. There is no straightforward way of ch...
Statistical Properties of the Method of Regularization with Periodic Gaussian Reproducing Kernel
The method of regularization with the Gaussian reproducing kernel is popular in the machine learning literature and successful in many practical applications. In this paper we consider the periodic version of the Gaussian kernel regularization. We show, in the white noise model setting, that in function spaces of very smooth functions, such as the infinite-order Sobolev space and the space of an...